An Evaluation through Simulation of Electrolarynx Control based on Statistical F0 Prediction for Multiple Speakers
Authors
Abstract
An electrolarynx is a device that artificially generates excitation sounds to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce intelligible EL speech with this device, the speech sounds quite unnatural because of the mechanical excitation. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and shown that statistical prediction of excitation parameters, such as F0 patterns, is essential to significantly improving the naturalness of EL speech. Based on this result, we have also proposed a method that directly controls the F0 patterns of the excitation sounds generated by the electrolarynx using statistical excitation prediction, as an EL speech enhancement method applicable to face-to-face conversation. In our previous work, this direct control method was evaluated through simulation using only a single laryngectomee's EL speech, and it was demonstrated that the method improves the naturalness of EL speech while preserving its listenability. However, because the quality of EL speech depends heavily on the proficiency of each laryngectomee, it is necessary to apply the method to other laryngectomees and evaluate its effectiveness. In addition, the method needs to be evaluated from a wider range of perspectives, covering intelligibility as well as naturalness and listenability. In this paper, we apply the direct control method to multiple speakers, consisting of two real laryngectomees and one non-laryngectomee, and evaluate its performance through simulations in terms of naturalness, listenability, and intelligibility. The experimental results demonstrate that the proposed method yields significant improvements in the naturalness of EL speech for multiple laryngectomees while keeping its listenability and intelligibility sufficiently high.
Related resources
An inter-speaker evaluation through simulation of electrolarynx control based on statistical F0 prediction
An electrolarynx is a device that artificially generates excitation sounds to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce intelligible EL speech by using this device, it sounds quite unnatural due to the mechanical excitation. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and showed that ...
Direct F0 control of an electrolarynx based on statistical excitation feature prediction and its evaluation through simulation
An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. To address this issue, we have proposed several EL speech enhancement methods using statistical v...
A Vibration Control Method of an Electrolarynx Based on Statistical F0 Pattern Prediction
This paper presents a novel speaking aid system to help laryngectomees produce more naturally sounding electrolaryngeal (EL) speech. An electrolarynx is an external device to generate excitation signals, instead of vibration of the vocal folds. Although the conventional EL speech is quite intelligible, its naturalness suffers from the unnatural fundamental frequency (F0) patterns of the mechani...
Real-time vibration control of an electrolarynx based on statistical F0 contour prediction
An electrolarynx is a speaking aid device to artificially generate excitation sounds to help laryngectomees produce electrolaryngeal (EL) speech. Although EL speech is quite intelligible, its naturalness significantly suffers from the unnatural fundamental frequency (F0) patterns of the mechanical excitation sounds. To make it possible to produce more naturally sounding EL speech, we have propo...
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement
Electrolaryngeal (EL) speech produced by a laryngectomee using an electrolarynx to mechanically generate artificial excitation sounds severely suffers from unnatural fundamental frequency (F0) patterns caused by monotonic excitation sounds. To address this issue, we have previously proposed EL speech enhancement systems using statistical F0 pattern prediction methods based on a Gaussian Mixture...